NSF PAR Search | NSF Public Access Repository


Through the Citizen Scientists’ Eyes: Insights into Using Citizen Science with Machine Learning for Effective Identification of Unknown-Unknowns in Big Data

https://doi.org/10.5334/cstp.740

Mantha, Kameswara Bharadwaj; Roberts, Hayley; Fortson, Lucy; Lintott, Chris; Dickinson, Hugh; Keel, William; Sankar, Ramanakumar; Krawczyk, Coleman; Simmons, Brooke; Walmsley, Mike; et al (December 2024, Citizen Science: Theory and Practice)
Fortson, Lucy; Crowston, Kevin; Kloetzer, Laure; Ponti, Marisa (Ed.)
In the era of rapidly growing astronomical data, the gap between data collection and analysis is a significant barrier, especially for teams searching for rare scientific objects. Although machine learning (ML) can quickly parse large data sets, it struggles to robustly identify scientifically interesting objects, a task at which humans excel. Human-in-the-loop (HITL) strategies that combine the strengths of citizen science (CS) and ML offer a promising solution, but first, we need to better understand the relationship between human- and machine-identified samples. In this work, we present a case study from the Galaxy Zoo: Weird & Wonderful project, where volunteers inspected ~200,000 astronomical images—processed by an ML-based anomaly detection model—to identify those with unusual or interesting characteristics. Volunteer-selected images with common astrophysical characteristics had higher consensus, while rarer or more complex ones had lower consensus. This suggests low-consensus choices shouldn’t be dismissed in further explorations. Additionally, volunteers were better at filtering out uninteresting anomalies, such as image artifacts, which the machine struggled with. We also found that a higher ML-generated anomaly score that indicates images’ low-level feature anomalousness was a better predictor of the volunteers’ consensus choice. Combining a locus of high volunteer-consensus images within the ML learnt feature space and anomaly score, we demonstrated a decision boundary that can effectively isolate images with unusual and potentially scientifically interesting characteristics. Using this case study, we lay important guidelines for future research studies looking to adapt and operationalize human-machine collaborative frameworks for efficient anomaly detection in big data.
Full Text Available
Communicating the gravitational-wave discoveries of the LIGO-Virgo-KAGRA Collaboration

https://doi.org/10.22323/2.23070803

Middleton, Hannah; Berry, Christopher P L; Arnaud, Nicolas; Blair, David; Bondell, Jacqueline; Bonino, Alice; Bonne, Nicolas; Chatterjee, Debarati; Chaty, Sylvain; Colloms, Storm; et al (October 2024, Journal of Science Communication)

The LIGO-Virgo-KAGRA (LVK) Collaboration has made breakthrough discoveries in gravitational-wave astronomy, a new field that provides a different means of observing our Universe. Gravitational-wave discoveries are possible thanks to the work of thousands of people from across the globe working together. In this article, we discuss the range of engagement activities used to communicate LVK gravitational-wave discoveries and the stories of the people behind the science, using the activities surrounding the release of the third Gravitational-Wave Transient Catalog as a case study.
Full Text Available
Galaxy Zoo: Clump Scout – Design and first application of a two-dimensional aggregation tool for citizen science

https://doi.org/10.1093/mnras/stac2919

Dickinson, Hugh; Adams, Dominic; Mehta, Vihang; Scarlata, Claudia; Fortson, Lucy; Serjeant, Stephen; Krawczyk, Coleman; Kruk, Sandor; Lintott, Chris; Mantha, Kameswara Bharadwaj; et al (October 2022, Monthly Notices of the Royal Astronomical Society)

Galaxy Zoo: Clump Scout is a web-based citizen science project designed to identify and spatially locate giant star forming clumps in galaxies that were imaged by the Sloan Digital Sky Survey Legacy Survey. We present a statistically driven software framework that is designed to aggregate two-dimensional annotations of clump locations provided by multiple independent Galaxy Zoo: Clump Scout volunteers and generate a consensus label that identifies the locations of probable clumps within each galaxy. The statistical model our framework is based on allows us to assign false-positive probabilities to each of the clumps we identify, to estimate the skill levels of each of the volunteers who contribute to Galaxy Zoo: Clump Scout and also to quantitatively assess the reliability of the consensus labels that are derived for each subject. We apply our framework to a data set containing 3 561 454 two-dimensional points, which constitute 1 739 259 annotations of 85 286 distinct subjects provided by 20 999 volunteers. Using this data set, we identify 128 100 potential clumps distributed among 44 126 galaxies. This data set can be used to study the prevalence and demographics of giant star forming clumps in low-redshift galaxies. The code for our aggregation software framework is publicly available at: https://github.com/ou-astrophysics/BoxAggregator
Galaxy Zoo DECaLS: Detailed visual morphology measurements from volunteers and deep learning for 314 000 galaxies

https://doi.org/10.1093/mnras/stab2093

Walmsley, Mike; Lintott, Chris; Géron, Tobias; Kruk, Sandor; Krawczyk, Coleman; Willett, Kyle W; Bamford, Steven; Kelvin, Lee S; Fortson, Lucy; Gal, Yarin; et al (December 2021, Monthly Notices of the Royal Astronomical Society)

We present Galaxy Zoo DECaLS: detailed visual morphological classifications for Dark Energy Camera Legacy Survey images of galaxies within the SDSS DR8 footprint. Deeper DECaLS images (r = 23.6 versus r = 22.2 from SDSS) reveal spiral arms, weak bars, and tidal features not previously visible in SDSS imaging. To best exploit the greater depth of DECaLS images, volunteers select from a new set of answers designed to improve our sensitivity to mergers and bars. Galaxy Zoo volunteers provide 7.5 million individual classifications over 314 000 galaxies. 140 000 galaxies receive at least 30 classifications, sufficient to accurately measure detailed morphology like bars, and the remainder receive approximately 5. All classifications are used to train an ensemble of Bayesian convolutional neural networks (a state-of-the-art deep learning method) to predict posteriors for the detailed morphology of all 314 000 galaxies. We use active learning to focus our volunteer effort on the galaxies which, if labelled, would be most informative for training our ensemble. When measured against confident volunteer classifications, the trained networks are approximately 99 per cent accurate on every question. Morphology is a fundamental feature of every galaxy; our human and machine classifications are an accurate and detailed resource for understanding how galaxies evolve.
Full Text Available
The Fifteenth Data Release of the Sloan Digital Sky Surveys: First Release of MaNGA-derived Quantities, Data Visualization Tools, and Stellar Library

https://doi.org/10.3847/1538-4365/aaf651

Aguado, D. S.; Ahumada, Romina; Almeida, Andrés; Anderson, Scott F.; Andrews, Brett H.; Anguiano, Borja; Ortíz, Erik Aquino; Aragón-Salamanca, Alfonso; Argudo-Fernández, Maria; Aubert, Marie; et al (February 2019, The Astrophysical Journal Supplement Series)

Full Text Available
